Unlabeled Data and Multiple Views
Abstract
In many real-world applications, unlabeled data are abundant but the number of labeled training examples is often limited, since labeling data requires extensive human effort and expertise. Exploiting unlabeled data to improve learning performance has therefore attracted significant attention. The major techniques for this purpose include semi-supervised learning and active learning. These techniques were initially developed for data with a single view, that is, a single feature set; recent studies have shown that, for multi-view data, semi-supervised learning and active learning can work amazingly well. This article briefly reviews some recent advances in this line of research.
Similar resources
Detecting Changes in Unlabeled Data Streams Using Martingale
The martingale framework for detecting changes in data streams, currently applicable only to labeled data, is extended here to unlabeled data using a clustering concept. The one-pass incremental change-detection algorithm (i) does not require a sliding window on the data stream, (ii) does not require monitoring the performance of the clustering algorithm as data points are streaming, and (iii) work...
Robust Multi-View Boosting with Priors
Many learning tasks for computer vision problems can be described by multiple views or multiple features. These views can be exploited in order to learn from unlabeled data, a.k.a. “multi-view learning”. In these methods, the classifiers usually iteratively label a subset of the unlabeled data for each other and ignore the rest. In this work, we propose a new multi-view boosting algorithm that, unl...
Learning with Weak Views Based on Dependence Maximization Dimensionality Reduction
A large number of applications involving multiple views of data are coming into use, e.g., reporting news on the Internet through both text and video, or identifying a person by both fingerprints and face images. Meanwhile, labeling these data is expensive, and thus most data are left unlabeled in many applications. Co-training can exploit the information of unlabeled data in multi-view sc...
Multi-view based unlabeled data selection using feature transformation methods for semiboost learning
SemiBoost [23] is a boosting framework for semi-supervised learning, in which both unlabeled and labeled data contribute to learning. Various strategies have been proposed in the literature for selecting useful unlabeled data in SemiBoost. Recently, a multi-view based strategy was proposed in [20], in which the feature set of the data is decomposed into subsets (i...
A Co-training based Framework for Writer Identification in Offline Handwriting
Traditional forensic document analysis methods have focused on the feature-classification paradigm, in which a machine-learning-based classifier is used to learn to discriminate among multiple writers. However, the use of such techniques depends on the availability of a large labeled dataset, which is not always feasible. In this paper, we propose a co-training based approach that overcomes this limitatio...